Policy Distillation

Authors

Andrei A. Rusu

Sergio Gomez Colmenarejo

Çaglar Gülçehre

Guillaume Desjardins

James Kirkpatrick

Razvan Pascanu

Volodymyr Mnih

Koray Kavukcuoglu

Raia Hadsell

Abstract

Policies for complex visual tasks have been successfully learned with deep reinforcement learning, using an approach called deep Q-networks (DQN), but relatively large (task-specific) networks and extensive training are needed to achieve good performance. In this work, we present a novel method called policy distillation that can be used to extract the policy of a reinforcement learning agent and train a new network that performs at the expert level while being dramatically smaller and more efficient. Furthermore, the same method can be used to consolidate multiple task-specific policies into a single policy. We demonstrate these claims using the Atari domain and show that the multi-task distilled agent outperforms the single-task teachers as well as a jointly-trained DQN agent.
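
The abstract does not state the training objective, so the following is only a minimal sketch of one plausible distillation loss: the Kullback-Leibler divergence between a temperature-sharpened softmax over the teacher's Q-values and the student's softmax output for a single state. The function names, the pure-Python formulation, and the default temperature of 0.01 are assumptions made for illustration, not details taken from this page.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_kl_loss(teacher_q, student_logits, temperature=0.01):
    """KL(teacher || student) for one state (hypothetical helper).

    teacher_q: Q-values produced by the teacher network for each action.
    student_logits: unnormalized action scores produced by the student.
    temperature: assumed sharpening factor applied to the teacher only;
        values below 1 push the target toward the teacher's greedy action.
    """
    target = softmax(teacher_q, temperature)
    predicted = softmax(student_logits)
    loss = 0.0
    for p, q in zip(target, predicted):
        if p > 0.0:  # terms with p == 0 contribute nothing to the KL sum
            loss += p * (math.log(p) - math.log(max(q, 1e-12)))
    return loss
```

In a multi-task setting, the same loss could be averaged over states sampled from each teacher's replay data in turn, with the student minimizing the average by gradient descent; that training loop is omitted here.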

Similar resources

Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay

The process for transferring knowledge of multiple reinforcement learning policies into a single multi-task policy via distillation technique is known as policy distillation. When policy distillation is under a deep reinforcement learning setting, due to the giant parameter size and the huge state space for each task domain, it requires extensive computational efforts to train the multi-task po...

Multi-skilled Motion Control

Deep reinforcement learning has demonstrated increasing capabilities for continuous control problems, including agents that can move with skill and agility through their environment. An open problem in this setting is that of developing good strategies for integrating or merging policies for multiple skills, where each individual skill is a specialist in a specific skill and its associated stat...

Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control

Greener Solvent Selection, Solvent Recycling and Optimal Control for Pharmaceutical and Bio-processing Industries

This paper proposes the simultaneous integration of environmentally benign solvent selection (chemical synthesis), solvent recycling (process synthesis) and optimal control for the separation of azeotropic systems using batch distillation. The previous work performed by Kim et al. (2004) combines the chemical synthesis and process synthesis under uncertainty. For batch distillation, optimal ope...

Inferential Estimation for a Ternary Batch Distillation

A Kalman filter (KF) estimator has been formulated using a sequence of reduced-order models representing a whole batch behavior for providing the estimates of dynamic composition in a ternary batch distillation process operated in an optimal-reflux policy. A set of full-order models is firstly obtained by linearizing around different pseudo-steady state operating conditions along batch optimal ...

Journal:

CoRR

Volume: abs/1511.06295

Publication date: 2015
